On the Kendall Correlation Coefficient


A. Stepanov, * Immanuel Kant Baltic Federal University 


Abstract 

In the present paper, we first discuss the Kendall rank correlation coefficient $\tau_n$. In
the continuous case, we define $\tau_n$ in terms of the concomitants of order statistics, find the
expected value of $\tau_n$ and show that the latter is free of $n$. We also prove that in the
continuous case the Kendall correlation coefficient converges in probability to its expected
value $\tau = E\tau_n$. We then propose to consider $\tau$ as a new theoretical correlation
coefficient which can be an alternative to the classical Pearson product-moment correlation
coefficient. At the end of this work we analyze illustrative examples.

Keywords and Phrases: bivariate distributions; concomitants of order statistics; Pearson
product-moment correlation coefficient; sample correlation coefficient; Kendall rank
correlation coefficient.


1 Introduction 

Let $(X, Y), (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be independent and identically distributed
random vectors with bivariate distribution $F(x,y) = P(X < x, Y < y)$ and corresponding
marginal distributions $H(x) = P(X < x)$ and $G(y) = P(Y < y)$. The definitions of $F$, $H$
and $G$ are slightly modified because in Section 4 we define a new correlation coefficient
and wish it to have the same form for different types of distributions. If $F$ is an
absolutely continuous distribution, then the corresponding densities will be denoted by
$f(x,y)$, $h(x)$ and $g(y)$, respectively. Let $X_{1,n} < X_{2,n} < \ldots < X_{n,n}$
be the order statistics obtained from the sample $X_1, X_2, \ldots, X_n$. For these order statistics,
let us define their concomitants $Y_{[1,n]}, Y_{[2,n]}, \ldots, Y_{[n,n]}$. Let $X_i = X_{j,n}$. Then $Y_{[j,n]} = Y_i$ is the
concomitant of the order statistic $X_{j,n}$. Concomitants of order statistics were proposed by
David (1973) and Bhattacharya (1974) and were further discussed in
David and Galambos (1974), Bhattacharya (1984), Egorov and Nevzorov (1984),
David (1994), Goel and Hall (1994), Chu et al. (1999), David and Nagaraja (2003), Bairamov
and Stepanov (2010) and others.
It is known that the degree of dependence between random variables $X$ and $Y$ can be
measured in terms of the Pearson product-moment correlation coefficient

$$\rho = \frac{E(X - EX)(Y - EY)}{\sigma_X \sigma_Y}.$$

The sample correlation coefficient

$$\rho_n = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

is a good approximation for $\rho$ for large values of $n$ since $\rho_n \xrightarrow{P} \rho$; see, for example, Fisher
(1921).

Let $Y_{[1,n]}, \ldots, Y_{[n,n]}$ be the concomitants of the order statistics $X_{1,n} < \ldots < X_{n,n}$. Let

$$r_{j,n} = \sum_{i=1}^{j-1} I_{i,j,n} = \sum_{i=1}^{j-1} I\{Y_{[i,n]} < Y_{[j,n]}\} \quad (1 \le i < j \le n)$$

be the rank of the concomitant $Y_{[j,n]}$ among the concomitants $Y_{[1,n]}, \ldots, Y_{[j,n]}$. The following
value

$$\tau_n = \frac{4 \sum_{j=2}^{n} r_{j,n}}{n(n-1)} - 1 = \frac{4}{n(n-1)} \sum_{j=2}^{n} \sum_{i=1}^{j-1} I_{i,j,n} - 1$$

is known as Kendall's rank correlation coefficient (see, for example, Kendall (1970)). This
rank correlation coefficient $\tau_n$ can be used along with $\rho_n$. Sometimes the samples $X_i$ and $Y_j$ are
not known but the ranks $r_{j,n}$ are known. In this case one can use $\tau_n$ instead of $\rho_n$. It should
be noted that there is another rank correlation coefficient, Spearman's rank correlation
coefficient, which we do not discuss in the present work.
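
The definition above can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function name `kendall_tau_n` is hypothetical, and the sketch assumes a sample without ties (which holds with probability one in the continuous case).

```python
def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitant ranks r_{j,n}; assumes no ties."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two samples of equal length n >= 2")
    # Concomitants Y_[1,n], ..., Y_[n,n]: the Y values reordered by increasing X.
    concomitants = [y for _, y in sorted(zip(xs, ys))]
    # Sum of the ranks r_{j,n}: for each j, count earlier concomitants below the j-th.
    total = 0
    for j in range(1, n):
        total += sum(1 for i in range(j) if concomitants[i] < concomitants[j])
    return 4.0 * total / (n * (n - 1)) - 1.0


print(kendall_tau_n([1, 2, 3, 4], [10, 20, 30, 40]))  # perfect agreement
print(kendall_tau_n([1, 2, 3, 4], [40, 30, 20, 10]))  # perfect disagreement
```

The double loop costs $O(n^2)$ comparisons, which is adequate for illustration; faster algorithms exist but are not needed here.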

Basic properties of $\tau_n$ are as follows. If the agreement (disagreement) between the sequence
$X_{j,n}$ and the rankings of $Y_{[j,n]}$ is perfect, then $\tau_n$ equals $1$ ($-1$). Further, if the
variables $X$ and $Y$ are independent, then $Y_{[j,n]}$, $j = 1, \ldots, n$, are independent and identically
distributed. Obviously, $P(Y_{[i,n]} < Y_{[j,n]}) = 1/2$ for all $i, j$, $i \ne j$. Then $\tau_n \xrightarrow{P} 0$. Different bounds
for $\tau_n$ (basically in the normal case) were obtained in Daniels (1950), Durbin and Stuart
(1951), Kendall (1970) and Xu et al. (2009); see also references in these works.

Further in the paper, we discuss moment and asymptotic properties of $\tau_n$. In Section 2,
we find $E\tau_n$ in the general continuous case and show that it is free of $n$. In Section 3, we prove
that in the general continuous case the Kendall correlation coefficient converges in probability
to its expected value. The last observation motivates us to introduce in Section 4 a new
correlation coefficient: $\tau = E\tau_n$. This correlation coefficient $\tau$ is a possible alternative to
the classical Pearson product-moment correlation coefficient $\rho$. It should be noted that no
moment assumption is needed for the existence of $\tau$. We illustrate our theoretical results in
Section 5 by examples and simulation results.
2 The expected value of $\tau_n$


We assume in this section that $F$ is a continuous distribution.
Lemma 2.1. For $n \ge 2$, we have

$$E\tau_n = 4 \int_{x<u,\,y<v} F(dx, dy) F(du, dv) - 1 \qquad (2.1)$$

$$= 4 \int_{\mathbb{R}^2} (1 - H(x) - G(y) + F(x,y)) F(dx, dy) - 1 \qquad (2.2)$$

$$= 4 \int_{\mathbb{R}^2} F(x,y) F(dx, dy) - 1 = 4EF(X,Y) - 1. \qquad (2.3)$$

Proof. It is easily seen that (2.1) implies both (2.2) and (2.3). We present the proof of (2.1)
for the case when $F$ is an absolutely continuous distribution. Let

$$Z_{[1,n]} = (X_{1,n}, Y_{[1,n]}), \ldots, Z_{[n,n]} = (X_{n,n}, Y_{[n,n]}).$$

One can find that

$$f_{Z_{[1,n]},\ldots,Z_{[n,n]}}(x_1, y_1, \ldots, x_n, y_n) = n!\, f(x_1, y_1) \cdots f(x_n, y_n) \quad (x_1 < \ldots < x_n,\ y_i \in \mathbb{R}) \qquad (2.4)$$

and

$$f_{Y_{[1,n]},Y_{[2,n]},\ldots,Y_{[n,n]}}(y_1, y_2, \ldots, y_n) = n! \int_{x_1<\ldots<x_n} f(x_1, y_1) \cdots f(x_n, y_n)\, dx_1 \ldots dx_n. \qquad (2.5)$$

By integration in (2.5), one can find that for $y, v \in \mathbb{R}$

$$f_{Y_{[i,n]},Y_{[j,n]}}(y, v) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \int_{x<u} H^{i-1}(x) f(x,y) (H(u) - H(x))^{j-i-1} f(u,v) (1 - H(u))^{n-j}\, dx\, du.$$

Then

$$E r_{j,n} = \sum_{i=1}^{j-1} P(Y_{[i,n]} < Y_{[j,n]}) = \frac{n!}{(j-2)!\,(n-j)!} \int_{x<u,\,y<v} f(x,y) f(u,v) H^{j-2}(u) (1 - H(u))^{n-j}\, dx\, du\, dy\, dv.$$

We arrive at the following identity

$$E\tau_n = \frac{4 \sum_{j=2}^{n} E r_{j,n}}{n(n-1)} - 1 = 4 \int_{x<u,\,y<v} f(x,y) f(u,v)\, dx\, dy\, du\, dv - 1. \qquad (2.6)$$

The result readily follows. When $F$ is continuous the result can be proved in the same
manner. □

Remark 2.1. Since $E\tau_n$ is free of $n$, let $\tau = E\tau_n$.
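
The claim that the expected value is free of $n$ can be checked informally by Monte Carlo. The sketch below is an illustration only; the model $Y = X + Z$ with independent standard normal $X$ and $Z$, the function names and the replication counts are all assumptions made here, not taken from the paper. For a bivariate normal pair with correlation $\rho$, the known relation $\tau = (2/\pi)\arcsin\rho$ gives $\tau = 0.5$ for this model.

```python
import random


def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitants; assumes no ties."""
    n = len(xs)
    conc = [y for _, y in sorted(zip(xs, ys))]
    concordant = sum(conc[i] < conc[j] for j in range(1, n) for i in range(j))
    return 4.0 * concordant / (n * (n - 1)) - 1.0


def mean_tau_n(n, reps, seed=0):
    """Monte Carlo estimate of E(tau_n) for the hypothetical model Y = X + Z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + rng.gauss(0.0, 1.0) for x in xs]
        total += kendall_tau_n(xs, ys)
    return total / reps


# The averages should stay close to each other (and to 0.5) as n varies.
for n in (2, 5, 20):
    print(n, round(mean_tau_n(n, reps=2000), 3))
```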
3 Asymptotic properties of $\tau_n$

We assume in this section that $F$ is a continuous distribution.
Theorem 3.1. The following asymptotic property holds true:

$$\tau_n \xrightarrow{P} \tau.$$


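Proof. We present the proof of Theorem 3.1 for the case when $F$ is an absolutely continuous
distribution. It follows from Chebyshev's inequality that for any $\varepsilon > 0$

$$P(|\tau_n - \tau| > \varepsilon) \le \frac{\mathrm{Var}\,\tau_n}{\varepsilon^2} = \frac{16}{\varepsilon^2 n^2 (n-1)^2} \left[ \sum_{j=2}^{n} \sum_{i=1}^{j-1} \sum_{k=2}^{n} \sum_{l=1}^{k-1} E(I_{i,j,n} I_{l,k,n}) - \left( E \sum_{j=2}^{n} \sum_{i=1}^{j-1} I_{i,j,n} \right)^2 \right]. \qquad (3.1)$$

By (2.6), we have

$$\left( E \sum_{j=2}^{n} \sum_{i=1}^{j-1} I_{i,j,n} \right)^2 = \left( n(n-1) \int_{x<u,\,y<v} f(x,y) f(u,v)\, dx\, dy\, du\, dv \right)^2.$$

Let

$$\Sigma = \sum_{j=2}^{n} \sum_{i=1}^{j-1} \sum_{k=2}^{n} \sum_{l=1}^{k-1} E(I_{i,j,n} I_{l,k,n}) = \sum_{j=2}^{n} \sum_{i=1}^{j-1} \sum_{k=2}^{n} \sum_{l=1}^{k-1} P(Y_{[i,n]} < Y_{[j,n]},\ Y_{[l,n]} < Y_{[k,n]}).$$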

One can write $\Sigma = \Sigma_1 + \Sigma_2$, where

$$\Sigma_1 = \sum_{1 \le i<j<l<k \le n} + \sum_{1 \le i<l<j<k \le n} + \sum_{1 \le l<i<j<k \le n} + \sum_{1 \le l<k<i<j \le n} + \sum_{1 \le l<i<k<j \le n} + \sum_{1 \le i<l<k<j \le n},$$

$$\Sigma_2 = 2\sum_{1 \le i<j=l<k \le n} + 2\sum_{1 \le i=l<j<k \le n} + 2\sum_{1 \le i<l<j=k \le n} + 4\sum_{1 \le i=l<j=k \le n} + 2\sum_{1 \le l<i<j=k \le n} + 2\sum_{1 \le l<k=i<j \le n} + 2\sum_{1 \le l=i<k<j \le n}.$$

Here

$$\sum_{1 \le i<j<l<k \le n} = \sum_{k=4}^{n} \sum_{l=3}^{k-1} \sum_{j=2}^{l-1} \sum_{i=1}^{j-1} P(Y_{[i,n]} < Y_{[j,n]},\ Y_{[l,n]} < Y_{[k,n]}),$$

$$\sum_{1 \le i<j=l<k \le n} = \sum_{k=3}^{n} \sum_{j=2}^{k-1} \sum_{i=1}^{j-1} P(Y_{[i,n]} < Y_{[j,n]} < Y_{[k,n]}),$$

and the other terms (sums) in $\Sigma_1$ and $\Sigma_2$ are designated in the same fashion. We show in
this section that the terms in $\Sigma_1$ behave like $O(n^4)$, whereas the terms in $\Sigma_2$ behave like $o(n^4)$.
Let us start with $\Sigma_2$. Let us, for example, show that $\frac{1}{n^4} \sum_{1 \le i<j=l<k \le n} \to 0$ as $n \to \infty$. It follows
from (2.5) that

$$\sum_{k=3}^{n} \sum_{j=2}^{k-1} \sum_{i=1}^{j-1} P(Y_{[i,n]} < Y_{[j,n]} < Y_{[k,n]}) = \sum_{k=3}^{n} \sum_{j=2}^{k-1} \sum_{i=1}^{j-1} \frac{n!}{(i-1)!\,(j-i-1)!\,(k-j-1)!\,(n-k)!}$$

$$\times \int_{x_i<x_j<x_k,\ y_i<y_j<y_k} H^{i-1}(x_i) f(x_i,y_i) (H(x_j) - H(x_i))^{j-i-1} f(x_j,y_j) (H(x_k) - H(x_j))^{k-j-1} f(x_k,y_k) (1 - H(x_k))^{n-k}\, dx_i\, dy_i\, dx_j\, dy_j\, dx_k\, dy_k$$

$$= n(n-1)(n-2) \int_{x_i<x_j<x_k,\ y_i<y_j<y_k} f(x_i,y_i) f(x_j,y_j) f(x_k,y_k)\, dx_i\, dy_i\, dx_j\, dy_j\, dx_k\, dy_k \le n(n-1)(n-2) = o(n^4).$$

One can show that the other terms of $\Sigma_2$ behave like $o(n^4)$ too. Let us now estimate the
terms in $\Sigma_1$. Let us take, for example, $\sum_{1 \le i<j<l<k \le n}$. It follows from (2.5) that


$$\sum_{1 \le i<j<l<k \le n} = \sum_{k=4}^{n} \sum_{l=3}^{k-1} \sum_{j=2}^{l-1} \sum_{i=1}^{j-1} P(Y_{[i,n]} < Y_{[j,n]},\ Y_{[l,n]} < Y_{[k,n]})$$

$$= \sum_{k=4}^{n} \sum_{l=3}^{k-1} \sum_{j=2}^{l-1} \sum_{i=1}^{j-1} \frac{n!}{(i-1)!\,(j-i-1)!\,(l-j-1)!\,(k-l-1)!\,(n-k)!}$$

$$\times \int_{x_i<x_j<x_l<x_k,\ y_i<y_j,\ y_l<y_k} H^{i-1}(x_i) f(x_i,y_i) (H(x_j) - H(x_i))^{j-i-1} f(x_j,y_j) (H(x_l) - H(x_j))^{l-j-1}$$

$$\times f(x_l,y_l) (H(x_k) - H(x_l))^{k-l-1} f(x_k,y_k) (1 - H(x_k))^{n-k}\, dx_i\, dy_i\, dx_j\, dy_j\, dx_l\, dy_l\, dx_k\, dy_k$$

$$= n(n-1)(n-2)(n-3) \int_{x_i<x_j<x_l<x_k,\ y_i<y_j,\ y_l<y_k} T,$$

where

$$T = f(x_i,y_i) f(x_j,y_j) f(x_l,y_l) f(x_k,y_k)\, dx_i\, dy_i\, dx_j\, dy_j\, dx_l\, dy_l\, dx_k\, dy_k.$$
In the same way one can estimate the other terms in $\Sigma_1$. Then

$$\Sigma = n(n-1)(n-2)(n-3) \left( \int_{x_i<x_j<x_l<x_k,\ y_i<y_j,\ y_l<y_k} T + \int_{x_i<x_l<x_j<x_k,\ y_i<y_j,\ y_l<y_k} T + \int_{x_l<x_i<x_j<x_k,\ y_i<y_j,\ y_l<y_k} T \right.$$

$$\left. + \int_{x_l<x_k<x_i<x_j,\ y_i<y_j,\ y_l<y_k} T + \int_{x_l<x_i<x_k<x_j,\ y_i<y_j,\ y_l<y_k} T + \int_{x_i<x_l<x_k<x_j,\ y_i<y_j,\ y_l<y_k} T \right) + o(n^4)$$

$$= n(n-1)(n-2)(n-3) \int_{x_i<x_j,\ x_l<x_k,\ y_i<y_j,\ y_l<y_k} T + o(n^4).$$

Observe that

$$\int_{x_i<x_j,\ x_l<x_k,\ y_i<y_j,\ y_l<y_k} T = \int_{x_i<x_j,\ x_l<x_k,\ y_i<y_j,\ y_l<y_k} f(x_i,y_i) f(x_j,y_j) f(x_l,y_l) f(x_k,y_k)\, dx_i\, dy_i\, dx_j\, dy_j\, dx_l\, dy_l\, dx_k\, dy_k$$

$$= \left( \int_{x_i<x_j,\ y_i<y_j} f(x_i,y_i) f(x_j,y_j)\, dx_i\, dy_i\, dx_j\, dy_j \right)^2.$$

Since $n(n-1)(n-2)(n-3) - n^2(n-1)^2 = O(n^3)$, the expression in square brackets in (3.1) is $o(n^4)$.
It follows from (3.1) that

$$P(|\tau_n - \tau| > \varepsilon) \to 0 \quad (n \to \infty).$$

The result readily follows. When $F$ is continuous the result can be proved in the same
manner. □
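
The convergence in Theorem 3.1 can be illustrated numerically by watching the sample variance of $\tau_n$ shrink as $n$ grows. The following is a rough sketch under assumptions made here (the model $Y = X + Z$ with independent standard normal $X$ and $Z$, and the hypothetical helper names); it is not part of the proof.

```python
import random


def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitants; assumes no ties."""
    n = len(xs)
    conc = [y for _, y in sorted(zip(xs, ys))]
    concordant = sum(conc[i] < conc[j] for j in range(1, n) for i in range(j))
    return 4.0 * concordant / (n * (n - 1)) - 1.0


def tau_n_variance(n, reps, seed=0):
    """Sample variance of tau_n over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + rng.gauss(0.0, 1.0) for x in xs]
        values.append(kendall_tau_n(xs, ys))
    mean = sum(values) / reps
    return sum((v - mean) ** 2 for v in values) / (reps - 1)


# The variance bounding P(|tau_n - tau| > eps) in (3.1) should decrease with n.
for n in (10, 40, 80):
    print(n, round(tau_n_variance(n, reps=300), 5))
```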


4 New correlation coefficient 

In the continuous case, we propose to consider the value

$$\tau = 4EF(X,Y) - 1 = 4 \int_{x<u,\,y<v} F(dx, dy) F(du, dv) - 1 = 4 \int_{\mathbb{R}^2} F(x,y) F(dx, dy) - 1$$

as a new theoretical correlation coefficient, which measures the degree of dependence between
the random variables $X$ and $Y$. Basic properties of $\tau$ are as follows.

Property 4.1. It is obvious that $-1 \le \tau \le 1$.

Property 4.2. It is easily seen that if $X$ and $Y$ are independent, then $\tau = 0$.

Property 4.3. Let $Y = \varphi(X)$, where $\varphi$ is a nondecreasing function. Then $\tau = 1$. Let
$Y = \psi(X)$, where $\psi$ is a nonincreasing function. Then $\tau = -1$.
Proof. We prove only the first statement. The second statement can be proved in the
same manner. We again assume that $F$ is an absolutely continuous distribution. Let us
consider the integral $\int_{\mathbb{R}^2} F_{X,Y}(x,y) f_{X,Y}(x,y)\, dx\, dy$, where the random variables $X$ and $Y$ are
now attached to the corresponding bivariate distribution and density. It is obvious that

$$P(x < X < x + dx,\ y < Y < y + dy) \approx f_{X,Y}(x,y)\, dx\, dy.$$

Let $Y = X$. Then for all small enough $dx$ and $dy$

$$P(x < X < x + dx,\ y < X < y + dy) = \begin{cases} h_X(x) \min\{dx, dy\}, & \text{if } y = x, \\ 0, & \text{otherwise.} \end{cases}$$

It follows from the last identity that

$$\int_{\mathbb{R}^2,\ y=x} F_{X,Y}(x,y) f_{X,Y}(x,y)\, dx\, dy = \int_{\mathbb{R}} H_X(x) h_X(x)\, dx = 1/2.$$

That way, if $Y = X$, then $\tau = 4EF(X,Y) - 1 = 1$. The result readily follows since
$EF(X, \varphi(X)) \ge EF(X, X)$. □
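
At the sample level, Property 4.3 can be seen directly: a strictly monotone transformation of $X$ preserves (or reverses) every pairwise ordering, so $\tau_n$ equals $1$ (or $-1$) for every sample. The sketch below illustrates this; the particular functions $e^x$ and $-x^3 - x$ and the helper name are choices made here for illustration only.

```python
import math
import random


def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitants; assumes no ties."""
    n = len(xs)
    conc = [y for _, y in sorted(zip(xs, ys))]
    concordant = sum(conc[i] < conc[j] for j in range(1, n) for i in range(j))
    return 4.0 * concordant / (n * (n - 1)) - 1.0


rng = random.Random(0)
# Fifteen distinct points on a grid with step 0.25, in random order.
xs = [v / 4.0 for v in rng.sample(range(-20, 21), 15)]
increasing = [math.exp(x) for x in xs]      # Y = phi(X), phi strictly increasing
decreasing = [-(x ** 3) - x for x in xs]    # Y = psi(X), psi strictly decreasing

print(kendall_tau_n(xs, increasing), kendall_tau_n(xs, decreasing))  # 1.0 -1.0
```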


Observe that the corresponding Property 4.3 for $\rho$ is as follows: $\rho = \pm 1$ iff $Y = aX + b$.

In the discrete case, one can define $\tau$ by the identity

$$\tau = 4EF(X,Y) - 1 = 4 \sum_{j} \sum_{k} f(j,k) F(j,k) - 1,$$

where $f(j,k) = P(X = j, Y = k)$ and $F(j,k) = P(X < j, Y < k)$. One can check
here the validity of Properties 4.1-4.3. To check Property 4.1 one can apply the inequality
$h(j)H(j) < 1/2$, where $h(j) = P(X = j)$ and $H(j) = P(X < j)$.

In the general case, when $F$ has discrete and continuous components, one can define $\tau$ as
follows: $\tau = 4EF(X,Y) - 1$.


The proposed correlation coefficient $\tau$ has some advantages and disadvantages in
comparison with the Pearson product-moment correlation coefficient $\rho$.

Advantages.

(1) We think that $\tau$ reflects the degree of dependence between $X$ and $Y$ better than $\rho$ since
$\tau$ is based on the whole information about a distribution. Observe that $\rho$ is based only on
the first and second moments.

(2) No moment assumption is needed for the existence of $\tau$.

(3) In the continuous case, we have proved that $\tau_n \xrightarrow{P} \tau$. It is also true that $\rho_n \xrightarrow{P} \rho$. That
way, both $\tau$ and $\rho$ are approximated for large values of $n$ by $\tau_n$ and $\rho_n$, respectively. However,
$E\tau_n = \tau$.






Disadvantages. 

(1) The Pearson product-moment correlation coefficient $\rho$ is simpler than $\tau$. To compute
$\rho$ one only needs to know the first and the second moments. To compute $\tau$ one needs to know the
distribution $F$. In this respect, $\rho$ has an advantage over $\tau$, because in many real experiments
(say, in time-series analysis) one often works only with first and second moments.

(2) The Pearson product-moment correlation coefficient $\rho$ appears explicitly in many
important statistical models and formulas such as linear regression models, bivariate normal
densities and so on.


5 Examples 


Example 5.1. Let

$$F(x,y) = 1 - e^{-x} - \frac{1 - e^{-x(y+1)^t}}{(y+1)^t} \quad (x > 0,\ y > 0,\ t > 0)$$

be a bivariate distribution with marginal distributions $H(x) = 1 - e^{-x}$ $(x > 0)$ and $G(y) =
1 - \frac{1}{(y+1)^t}$ $(y > 0)$. It follows from (2.2) that for any $t > 0$

$$\tau = E\tau_n = -0.5.$$

We made a corresponding simulation experiment, i.e. $\tau_n$ was computed many times for
"large" values of $n$ by simulation in Matlab. The code for Kendall's correlation coefficient is

corr(x, y, 'type', 'Kendall').

The experiment gave us the same value $-0.5$.
Observe that here p = — (t > 2). 
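
A simulation of this kind can also be sketched in Python. The sampling scheme below is an assumption derived here from the stated $F$, not taken from the paper: writing $W = (Y+1)^t$, the joint density of $(X, W)$ works out to $x e^{-xw}$ for $x > 0$, $w > 1$, so $X$ is standard exponential and, given $X = x$, $W - 1$ is exponential with rate $x$. The function names are hypothetical.

```python
import random


def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitants; assumes no ties."""
    n = len(xs)
    conc = [y for _, y in sorted(zip(xs, ys))]
    concordant = sum(conc[i] < conc[j] for j in range(1, n) for i in range(j))
    return 4.0 * concordant / (n * (n - 1)) - 1.0


def sample_example_5_1(n, t, rng):
    """Draw n pairs from F of Example 5.1 (scheme derived here, see lead-in)."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.expovariate(1.0)
        while x == 0.0:  # guard against a zero rate in the next draw
            x = rng.expovariate(1.0)
        w = 1.0 + rng.expovariate(x)  # W - 1 given X = x has rate x
        xs.append(x)
        ys.append(w ** (1.0 / t) - 1.0)
    return xs, ys


# tau_n should be close to -0.5 whatever the value of t.
for t in (0.5, 3.0):
    xs, ys = sample_example_5_1(1000, t, random.Random(1))
    print(t, round(kendall_tau_n(xs, ys), 3))
```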

Example 5.2. Let

$$F(x,y) = 1 - \frac{1}{(y+1)^t} - \frac{1}{(x+1)^t} + \frac{1}{(x+y+1)^t} \quad (x > 0,\ y > 0,\ t > 0)$$

be a bivariate distribution with marginal distributions $H(x) = 1 - \frac{1}{(x+1)^t}$ $(x > 0)$ and $G(y) =
1 - \frac{1}{(y+1)^t}$ $(y > 0)$. It follows from (2.2) that for any $t > 0$

$$\tau = \frac{1}{2t+1}.$$

A corresponding simulation experiment made for different values of $t > 0$ confirmed this
result.

Observe that here $\rho = \frac{1}{t}$ $(t > 2)$.
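
The simulation for this example can be sketched as follows. The sampling scheme is an assumption introduced here: if $\Lambda$ has a gamma distribution with shape $t$ and unit scale, and $E_1, E_2$ are independent standard exponentials, then $(E_1/\Lambda, E_2/\Lambda)$ has joint survival function $(1+x+y)^{-t}$, which corresponds to the $F$ above. The function names are hypothetical.

```python
import random


def kendall_tau_n(xs, ys):
    """Kendall's tau_n via concomitants; assumes no ties."""
    n = len(xs)
    conc = [y for _, y in sorted(zip(xs, ys))]
    concordant = sum(conc[i] < conc[j] for j in range(1, n) for i in range(j))
    return 4.0 * concordant / (n * (n - 1)) - 1.0


def sample_example_5_2(n, t, rng):
    """Draw n pairs from F of Example 5.2 via a shared gamma mixing variable."""
    xs, ys = [], []
    for _ in range(n):
        lam = rng.gammavariate(t, 1.0)  # shape t, scale 1
        xs.append(rng.expovariate(1.0) / lam)
        ys.append(rng.expovariate(1.0) / lam)
    return xs, ys


# Compare the simulated tau_n with the theoretical value 1 / (2t + 1).
for t in (0.5, 1.0, 2.0):
    xs, ys = sample_example_5_2(1000, t, random.Random(3))
    print(t, round(kendall_tau_n(xs, ys), 3), round(1.0 / (2.0 * t + 1.0), 3))
```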
Example 5.3. Let

$$F(x,y) = xy(1 + \alpha(1 - x)(1 - y)) \quad (0 \le x, y \le 1,\ -1 \le \alpha \le 1)$$

be a bivariate distribution with marginal distributions $H(x) = x$ $(0 \le x \le 1)$ and $G(y) =
y$ $(0 \le y \le 1)$. It follows from (2.2) that for any $\alpha$

$$\tau = \frac{2\alpha}{9}.$$

A corresponding simulation experiment made for different values of $\alpha$ confirmed this result.
Observe that here $\rho = \frac{\alpha}{3}$ $(-1 \le \alpha \le 1)$.
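
Because $F$ is a polynomial here, the value of $\tau = 4\int F(x,y) f(x,y)\,dx\,dy - 1$ can be checked with exact rational arithmetic. The sketch below represents a polynomial in $x, y$ as a dictionary mapping exponent pairs to coefficients; this representation and the helper names are choices made here for illustration.

```python
from fractions import Fraction


def mixed_partial(p):
    """d^2/(dx dy) of a polynomial given as {(a, b): coeff} for x^a y^b."""
    return {(a - 1, b - 1): c * a * b for (a, b), c in p.items() if a > 0 and b > 0}


def poly_mul(p, q):
    """Product of two bivariate polynomials."""
    out = {}
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] = out.get((a + d, b + e), 0) + c * f
    return out


def integrate_unit_square(p):
    """Integral of the polynomial over [0, 1] x [0, 1]."""
    return sum(Fraction(c) / ((a + 1) * (b + 1)) for (a, b), c in p.items())


def fgm_tau(alpha):
    """tau = 4 * integral of F * f - 1 for F(x,y) = xy(1 + alpha(1-x)(1-y))."""
    alpha = Fraction(alpha)
    # Expanded form: (1 + alpha) xy - alpha x^2 y - alpha x y^2 + alpha x^2 y^2.
    F = {(1, 1): 1 + alpha, (2, 1): -alpha, (1, 2): -alpha, (2, 2): alpha}
    f = mixed_partial(F)  # density 1 + alpha (1 - 2x)(1 - 2y)
    return 4 * integrate_unit_square(poly_mul(F, f)) - 1


print(fgm_tau(Fraction(1, 2)))  # 1/9, i.e. 2 * alpha / 9 with alpha = 1/2
```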

The paper is submitted to a statistical journal. 